145 research outputs found

Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and Localization

Author: Lai Yingxin
Luo Zhiming
Yu Zitong
Publication date: 29/06/2023

The rapid advancements in computer vision have stimulated remarkable progress in face forgery techniques, capturing the dedicated attention of researchers committed to detecting forgeries and precisely localizing manipulated areas. Nonetheless, with limited fine-grained pixel-wise supervision labels, deepfake detection models perform unsatisfactorily on precise forgery detection and localization. To address this challenge, we introduce the well-trained vision segmentation foundation model, i.e., the Segment Anything Model (SAM), into face forgery detection and localization. Based on SAM, we propose the Detect Any Deepfakes (DADF) framework with the Multiscale Adapter, which can capture short- and long-range forgery contexts for efficient fine-tuning. Moreover, to better identify forged traces and augment the model's sensitivity towards forgery regions, a Reconstruction Guided Attention (RGA) module is proposed. The proposed framework seamlessly integrates end-to-end forgery localization and detection optimization. Extensive experiments on three benchmark datasets demonstrate the superiority of our approach for both forgery detection and localization. The codes will be released soon at https://github.com/laiyingxin2/DADF

arXiv.org e-Print Archive

Diagnosis and Treatment of Tracheal or Bronchuotracheal Adenoid Cystic Carcinoma

Author: Daping YU
Ming HAN
Ming QIN
Shaofa XU
Yu FU
Zitong WANG
Publication venue: Chinese Anti-Cancer Association; Chinese Antituberculosis Association
Publication date: 01/06/2010

Background and objective: Adenoid cystic carcinoma is a primary bronchopulmonary carcinoma of low malignancy. The 43 patients treated in our hospital over the past 50 years were studied retrospectively. The aim of this study is to discuss the clinical symptoms, pathologic characteristics, and therapeutic methods of primary tracheal or bronchotracheal adenoid cystic carcinoma. Methods: This study summarized a total of 43 patients with primary tracheal or bronchial adenoid cystic carcinoma treated in our hospital from Jan. 1958 to Dec. 2007. Among them, 40 patients were treated by surgical resection, and 3 patients were treated by fiberoptic bronchoscopic interventional treatment. Results: The 1-yr, 3-yr, and 5-yr survival rates of the 43 patients above were 100% (41/41), 89.5% (34/38), and 87.1% (27/31), respectively. Conclusion: Primary tracheal or bronchial adenoid cystic carcinoma is a rare carcinoma of low malignancy. Its clinical symptoms are not typical. The best treatment is early detection followed by surgery plus radiotherapy. The alternative, palliative treatment is fiberoptic bronchoscopic interventional treatment.
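The survival percentages in this abstract can be checked against the counts given in parentheses. The following is a minimal sketch of that arithmetic; reading each denominator as the number of patients evaluable at that follow-up point is an assumption, and the names `survival_counts` and `survival_rate` are hypothetical.

```python
# (survivors, patients counted) at each follow-up point, copied from the abstract.
# Assumption: the denominator is the number of patients evaluable at that point.
survival_counts = {"1-yr": (41, 41), "3-yr": (34, 38), "5-yr": (27, 31)}


def survival_rate(survivors, evaluable):
    """Return the survival rate as a percentage rounded to one decimal place."""
    return round(100 * survivors / evaluable, 1)


rates = {period: survival_rate(*counts) for period, counts in survival_counts.items()}

for period, rate in rates.items():
    print(f"{period}: {rate}%")
```

The computed values agree with the 100%, 89.5%, and 87.1% figures reported in the abstract.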

Directory of Open Access Journals

Learning Meta Model for Zero- and Few-shot Face Anti-spoofing

Author: Fu Tianyu
Lei Zhen
Qin Yunxiao
Shi Jingping
Wang Zezheng
Yu Zitong
Zhao Chenxu
Zhou Feng
Zhu Xiangyu
Publication date: 01/12/2019

Face anti-spoofing is crucial to the security of face recognition systems. Most previous methods formulate face anti-spoofing as a supervised learning problem to detect various predefined presentation attacks, which requires large-scale training data to cover as many attacks as possible. However, the trained model tends to overfit to several common attacks and remains vulnerable to unseen attacks. To overcome this challenge, the detector should: 1) learn discriminative features that can generalize to unseen spoofing types from predefined presentation attacks; 2) quickly adapt to new spoofing types by learning from both the predefined attacks and a few examples of the new spoofing types. Therefore, we define face anti-spoofing as a zero- and few-shot learning problem. In this paper, we propose a novel Adaptive Inner-update Meta Face Anti-Spoofing (AIM-FAS) method to tackle this problem through meta-learning. Specifically, AIM-FAS trains a meta-learner focusing on the task of detecting unseen spoofing types by learning from predefined living and spoofing faces and a few examples of new attacks. To assess the proposed approach, we propose several benchmarks for zero- and few-shot FAS. Experiments show that it outperforms existing methods on the presented benchmarks and in existing zero-shot FAS protocols. Comment: Accepted by AAAI202

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning

Author: Huang Xiaobin
Ren Jianfeng
Shen Linlin
Yu Zitong
Zhang Yaning
Publication date: 02/02/2024

The rapid advancement of photorealistic generators has reached a critical juncture where authentic and manipulated images are increasingly indistinguishable. Thus, benchmarking and advancing techniques for detecting digital manipulation has become an urgent issue. Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology, which does not involve the most recent technologies like diffusion. The diversity and quality of images generated by diffusion models have been significantly improved, and thus a much more challenging face forgery dataset should be used to evaluate SOTA forgery detection approaches. In this paper, we propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection, which contains a large number of forgery faces generated by advanced generators such as the diffusion-based model and more detailed labels about the manipulation approaches and adopted generators. In addition to evaluating SOTA approaches on our benchmark, we design an innovative cross appearance-edge learning (CAEL) detector to capture multi-grained appearance and edge global representations, and detect discriminative and general forgery traces. Moreover, we devise an appearance-edge cross-attention (AECA) module to explore the various integrations across two domains. Extensive experimental results and visualizations show that our detection model outperforms the state of the art in different settings like cross-generator, cross-forgery, and cross-dataset evaluations. Code and datasets will be available at https://github.com/Jenine-321/GenFac

arXiv.org e-Print Archive

PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer

Author: Shen Yuming
Shi Jingang
Torr Philip
Yu Zitong
Zhao Guoying
Zhao Hengshuang
Publication date: 01/01/2022

Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing). Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields, which neglect the long-range spatio-temporal perception and interaction for rPPG modeling. In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture, to adaptively aggregate both local and global spatio-temporal features for rPPG representation enhancement. As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference. Furthermore, we also propose label distribution learning and a curriculum learning inspired dynamic constraint in the frequency domain, which provide elaborate supervision for PhysFormer and alleviate overfitting. Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra- and cross-dataset testing. One highlight is that, unlike most transformer networks, which need pretraining on large-scale datasets, the proposed PhysFormer can be easily trained from scratch on rPPG datasets, which makes it promising as a novel transformer baseline for the rPPG community. The codes will be released at https://github.com/ZitongYu/PhysFormer. Comment: Accepted by CVPR202
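The label distribution learning mentioned in this abstract can be illustrated generically: a scalar heart-rate label is replaced by a soft distribution over heart-rate bins, and a prediction is scored against it with a divergence. The sketch below is not taken from the PhysFormer code; the Gaussian shape, the 40 to 180 bpm bin range, the `sigma` value, and all function names are assumptions made for illustration.

```python
import numpy as np


def gaussian_label_distribution(hr_bpm, bins, sigma=1.0):
    """Turn a scalar heart rate into a normalized Gaussian distribution over bins."""
    weights = np.exp(-0.5 * ((bins - hr_bpm) / sigma) ** 2)
    return weights / weights.sum()


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions on the same bins."""
    return float(np.sum(target * (np.log(target + eps) - np.log(predicted + eps))))


# Hypothetical heart-rate bins in beats per minute (one bin per bpm).
bins = np.arange(40, 181)
target = gaussian_label_distribution(72.0, bins)

# A uniform prediction is penalized; a perfect prediction gives zero divergence.
uniform = np.full(len(bins), 1.0 / len(bins))
print(kl_divergence(target, uniform), kl_divergence(target, target))
```

A soft target of this kind gives partial credit to predictions near the true heart rate, which is one common motivation for label distribution learning over one-hot classification.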

arXiv.org e-Print Archive

Oxford University Research Archive

University of Oulu Repository - Jultika

Hyperbolic Face Anti-Spoofing

Author: Cai Rizhao
Cui Yawen
Han Shuangpeng
Hu Yongjian
Kot Alex
Yu Zitong
Publication date: 17/08/2023

Learning generalized face anti-spoofing (FAS) models against presentation attacks is essential for the security of face recognition systems. Previous FAS methods usually encourage models to extract discriminative features, of which the distances within the same class (bonafide or attack) are pushed close while those between bonafide and attack are pulled away. However, these methods are designed based on Euclidean distance, which lacks generalization ability for unseen attack detection due to poor hierarchy embedding ability. According to the evidence that different spoofing attacks are intrinsically hierarchical, we propose to learn richer hierarchical and discriminative spoofing cues in hyperbolic space. Specifically, for unimodal FAS learning, the feature embeddings are projected into the Poincaré ball, and then the hyperbolic binary logistic regression layer is cascaded for classification. To further improve generalization, we conduct hyperbolic contrastive learning for the bonafide only while relaxing the constraints on diverse spoofing attacks. To alleviate the vanishing gradient problem in hyperbolic space, a new feature clipping method is proposed to enhance the training stability of hyperbolic models. Besides, we further design a multimodal FAS framework with Euclidean multimodal feature decomposition and hyperbolic multimodal feature fusion & classification. Extensive experiments on three benchmark datasets (i.e., WMCA, PADISI-Face, and SiW-M) with diverse attack types demonstrate that the proposed method can bring significant improvement compared to the Euclidean baselines on unseen attack detection. In addition, the proposed framework also generalizes well on four benchmark datasets (i.e., MSU-MFSD, IDIAP REPLAY-ATTACK, CASIA-FASD, and OULU-NPU) with a limited number of attack types.
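Projecting Euclidean features into the Poincaré ball, as this abstract describes, is commonly done with the exponential map at the origin. The sketch below shows that standard map together with a plain norm clip applied beforehand. It is an illustration under stated assumptions, not the paper's implementation: the abstract proposes its own feature clipping method, which is not reproduced here, and the names `clip_norm`, `expmap0`, and the `max_norm` value are hypothetical.

```python
import numpy as np


def clip_norm(v, max_norm=1.0):
    """Rescale rows of v so that no row has Euclidean norm above max_norm.

    Generic norm clipping, used here only as a stand-in for a clipping step.
    """
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return v * scale


def expmap0(v, c=1.0):
    """Exponential map at the origin of a Poincaré ball with curvature -c.

    Maps Euclidean vectors to points with norm strictly below 1 / sqrt(c).
    """
    sqrt_c = np.sqrt(c)
    norms = np.maximum(np.linalg.norm(v, axis=-1, keepdims=True), 1e-12)
    return np.tanh(sqrt_c * norms) * v / (sqrt_c * norms)


# Toy Euclidean features: one large vector, one small vector, one zero vector.
features = np.array([[3.0, 4.0], [0.1, -0.2], [0.0, 0.0]])
embedded = expmap0(clip_norm(features, max_norm=2.0))
print(np.linalg.norm(embedded, axis=-1))
```

Clipping before the map keeps embeddings away from the boundary of the ball, where distances grow very large; that boundary behavior is a commonly cited source of unstable gradients in hyperbolic models.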

arXiv.org e-Print Archive

rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement

Author: Liu Xin
Lu Hao
Yang Jingyu
Yu Zitong
Yue Huanjing
Zhang Yuting
Publication date: 04/06/2023

Remote photoplethysmography (rPPG) is an important technique for perceiving human vital signs, which has received extensive attention. For a long time, researchers have focused on supervised methods that rely on large amounts of labeled data. These methods are limited by the requirement for large amounts of data and the difficulty of acquiring ground truth physiological signals. To address these issues, several self-supervised methods based on contrastive learning have been proposed. However, they focus on contrastive learning between samples, which neglects the inherent self-similar prior in physiological signals, and they seem to have a limited ability to cope with noise. In this paper, a linear self-supervised reconstruction task was designed for extracting the inherent self-similar prior in physiological signals. Besides, a specific noise-insensitive strategy was explored for reducing the interference of motion and illumination. The proposed framework in this paper, namely rPPG-MAE, demonstrates excellent performance even on the challenging VIPL-HR dataset. We also evaluate the proposed method on two public datasets, namely PURE and UBFC-rPPG. The results show that our method not only outperforms existing self-supervised methods but also exceeds the state-of-the-art (SOTA) supervised methods. One important observation is that the quality of the dataset seems more important than the size in self-supervised pre-training of rPPG. The source code is released at https://github.com/linuxsino/rPPG-MAE

arXiv.org e-Print Archive